Repeat Yourself

One of the most repeated pieces of advice throughout my career in software has been “don’t repeat yourself,” also known as the DRY principle. For the longest time, I took that at face value, never questioning its validity.

That was until I saw actual experts write code: they copy code all the time [1]. I realized that repeating yourself has a few great benefits.

Why People Love DRY

The common wisdom is that if you repeat yourself, you have to fix the same bug in multiple places, but if you have a shared abstraction, you only have to fix it once.

Another reason why we avoid repetition is that it makes us feel clever. “Look, I know all of these smart ways to avoid repetition! I know how to use interfaces, generics, higher-order functions, and inheritance!”

Both reasons are misguided. There are many benefits of repeating yourself that might get us closer to our goals in the long run.

Keeping Up The Momentum

When you’re writing code, you want to keep the momentum going to get into a flow state. If you constantly pause to design the perfect abstraction, it’s easy to lose momentum.

Instead, if you allow yourself to copy-paste code, you keep your train of thought going and work on the problem at hand. You don’t introduce another problem of trying to find the right abstraction at the same time.

It’s often easier to copy existing code and modify it until it becomes too much of a burden, at which point you can go and refactor it.

I would argue that “writing mode” and “refactoring mode” are two different modes of programming. During writing mode, you want to focus on getting the idea down and stop your inner critic, which keeps telling you that your code sucks. During refactoring mode, you take the opposite role: that of the critic. You look for ways to improve the code by finding the right abstractions, removing duplication, and improving readability.

Keep these two modes separate. Don’t try to do both at the same time. [2]

Finding The Right Abstraction Is Hard

When you start to write code, you don’t know the right abstraction just yet. But if you copy code, the right abstraction reveals itself: copying the same code over and over becomes too tedious, at which point you start to look for ways to abstract it away. For me, the urge typically arrives after the first copy, but I try to resist it until the second or third.

If you start too early, you might end up with a bad abstraction that doesn’t fit the problem. You know it’s wrong because it feels clunky. Some typical symptoms include:

  • Generic names that don’t convey intent, e.g., render_pdf_file instead of generate_invoice
  • Difficult to understand without additional context
  • The abstraction is only used in one or two places
  • Tight coupling to implementation details
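
As a rough illustration of the first symptom, compare a generic helper with an intent-revealing one. Only the two function names come from the list above; the `Invoice` type, its fields, the parameters, and the plain-text output standing in for a real PDF are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical type, invented for this sketch.
    customer: str
    amount_cents: int


def render_pdf_file(title, rows, footer=None):
    """Generic: callers must already know what to pass and why."""
    lines = [title, *rows]
    if footer:
        lines.append(footer)
    # A real implementation would produce a PDF; plain text stands in here.
    return "\n".join(lines)


def generate_invoice(invoice):
    """Intent-revealing: the name and signature say what it is for."""
    amount = f"{invoice.amount_cents / 100:.2f}"
    return "\n".join([
        f"Invoice for {invoice.customer}",
        f"Amount due: {amount}",
    ])
```

The second function can only be used for one thing, which is exactly what makes it easy to read at the call site.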

It’s Hard To Get Rid Of Wrong Abstractions

We easily settle for the first abstraction that comes to mind, but most often, it’s not the right one. And removing the wrong abstraction is hard work, because now the data flow depends on it.

We also tend to fall in love with our own abstractions because they took time and effort to create. This makes us reluctant to discard them even when they no longer fit the problem—it’s a sunk cost fallacy.

It gets worse when other programmers start to depend on it, too. Then you have to be careful about changing it, because it might break other parts of the codebase. Once you introduce an abstraction, you have to work with it for a long time, sometimes forever.

If you had a copy of the code instead, you could just change it in one place without worrying about breaking anything else.

Duplication is far cheaper than the wrong abstraction

—Sandi Metz, The Wrong Abstraction

Better to wait until the last moment to settle on the abstraction, when you have a solid understanding of the problem space. [3]

The Mental Overhead of Abstractions

Abstraction reduces code duplication, but it comes at a cost.

Abstractions can make code harder to read, understand, and maintain because you have to jump between multiple levels of indirection to understand what the code does. The abstraction might live in different files, modules, or libraries.

The cost of traversing these layers is high. An expert programmer might be able to keep a few levels of abstraction in their head, but we all have a limited context window (which depends on familiarity with the codebase).

When you copy code, you can keep all the logic in one place. You can just read the whole thing and understand what it does.

Resist The Urge Of Premature Abstraction

Sometimes, code looks similar but serves different purposes.

For example, consider two pieces of code that calculate a sum by iterating over a collection of items.

total = 0
for item in shopping_cart:
    total += item.price * item.quantity

And elsewhere in the code, we have

total = 0
for item in package_items:
    total += item.weight * item.rate

In both cases, we iterate over a collection and calculate a total. You might be tempted to introduce a helper function, but the two calculations are very different.
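
The tempting helper might look something like the sketch below. The name `sum_of_products` and its attribute-name parameters are made up for illustration, and `SimpleNamespace` stands in for whatever item types the real code uses.

```python
from types import SimpleNamespace


def sum_of_products(items, first_attr, second_attr):
    # Hypothetical helper: multiplies two attributes per item and sums them.
    return sum(getattr(i, first_attr) * getattr(i, second_attr) for i in items)


shopping_cart = [SimpleNamespace(price=10, quantity=2)]
package_items = [SimpleNamespace(weight=3, rate=4)]

cart_total = sum_of_products(shopping_cart, "price", "quantity")
shipping_total = sum_of_products(package_items, "weight", "rate")
```

It saves a few lines, but the call sites now say less about what is being computed, and a rule that applies only to prices or only to weights has no natural place to live.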

After a few iterations, these two pieces of code might evolve in different directions:

def calculate_total_price(shopping_cart):
    if not shopping_cart:
        raise ValueError("Shopping cart cannot be empty")
    
    total = 0.0
    for item in shopping_cart:
        # Round for financial precision
        total += round(item.price * item.quantity, 2)
    
    return total

In contrast, the shipping cost calculation might look like this:

def calculate_shipping_cost(package_items, destination_zone):
    # Assumes shipping_rates is a mapping defined elsewhere:
    # destination zone -> rate per unit of billable weight
    # Use higher of actual weight vs dimensional weight
    total_weight = sum(item.weight for item in package_items)
    total_volume = sum(item.length * item.width * item.height for item in package_items)
    dimensional_weight = total_volume / 5000  # FedEx formula

    billable_weight = max(total_weight, dimensional_weight)
    return billable_weight * shipping_rates[destination_zone]

Had we applied “don’t repeat yourself” too early, we would have lost the context and specific requirements of each calculation.

DRY Can Introduce Complexity

The DRY principle is often misinterpreted as a blanket rule to avoid any duplication at all costs, which can lead to unnecessary complexity.

When you try to avoid repetition by introducing abstractions, you have to deal with all the edge cases in a place far away from the actual business logic. You end up adding redundant checks and conditions to the abstraction, just to make sure it works in all cases. Later on, you might forget the reasoning behind those checks, but you keep them around “just in case” because you don’t want to break any callers. The result is dead code that adds complexity to the codebase, all because you wanted to avoid repeating yourself.
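
Here is a sketch of how this can play out with a shared helper that has served several callers for a while. The function `calculate_total` and every one of its parameters are invented for this example; they are not taken from any real codebase.

```python
from types import SimpleNamespace


def calculate_total(items, value_attr, factor_attr,
                    round_each=False, allow_empty=True, skip_none=True):
    # Added for the shopping cart caller, which must reject empty carts.
    if not items and not allow_empty:
        raise ValueError("items cannot be empty")

    total = 0
    for item in items:
        value = getattr(item, value_attr, None)
        factor = getattr(item, factor_attr, None)
        # "Just in case" check: nobody remembers which caller needed it.
        if skip_none and (value is None or factor is None):
            continue
        line = value * factor
        # Added for prices only; it has no meaning for weights.
        if round_each:
            line = round(line, 2)
        total += line
    return total


cart = [SimpleNamespace(price=2.5, quantity=2)]
cart_total = calculate_total(cart, "price", "quantity",
                             round_each=True, allow_empty=False)
```

Each flag made sense to someone at the time it was added. Together they make the helper harder to read than the two loops it replaced, and no single caller exercises all of its branches.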

The common wisdom is that if you repeat yourself, you have to fix the same bug in multiple places. But that assumes the bug exists in every copy. In reality, each copy might have evolved differently, and the bug might exist in only one of them.

When you create a shared abstraction, a bug in that abstraction breaks every caller, breaking multiple features at once. With duplicated code, a bug is isolated to just one specific use case.
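
To make the difference in blast radius concrete, here is a small sketch. The function names and the deliberate off-by-one bug are invented for illustration.

```python
def average(values):
    # Shared helper with a deliberate bug: divides by one too many.
    return sum(values) / (len(values) + 1)


def average_rating(ratings):
    return average(ratings)        # wrong because of the shared bug


def average_latency_ms(samples):
    return average(samples)        # wrong because of the same bug


# With duplicated logic, the same mistake stays in the copy that has it.
def average_rating_copy(ratings):
    return sum(ratings) / (len(ratings) + 1)   # buggy copy


def average_latency_ms_copy(samples):
    return sum(samples) / len(samples)         # unaffected
```

In the first half, one wrong line breaks two unrelated features. In the second half, only the rating calculation is affected.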

Clean Up Afterwards

Knowing that you didn’t break anything in a shared abstraction is much harder than checking a single copy of the code. Of course, if you have a lot of copies, there is a risk of forgetting to fix all of them.

The key to making this work is to clean up afterwards. This can happen before you commit the code or during a code review.

At this stage, you can look at the code you copied and see if it makes sense to keep it as is or if you can see the right abstraction. I try to refactor code once I have a better understanding of the problem, but not earlier.

A trick to undo a bad abstraction is to inline the code back into the places where it was used. For a while, you end up “repeating yourself” again in the codebase, but that’s okay. Rethink the problem based on the new information you have. Often you’ll find an abstraction that fits the problem better.
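
As a minimal sketch of that inlining step, suppose a hypothetical generic helper called `sum_of_products` had been extracted earlier. Inlining gives each call site its own copy of the loop again, with concrete names. All identifiers here are assumptions made for the example.

```python
from types import SimpleNamespace


# Before: both call sites go through one generic, hypothetical helper.
def sum_of_products(items, first_attr, second_attr):
    return sum(getattr(i, first_attr) * getattr(i, second_attr) for i in items)


# After: the helper's body is inlined into each caller.
def cart_total(shopping_cart):
    total = 0
    for item in shopping_cart:
        total += item.price * item.quantity
    return total


def shipping_total(package_items):
    total = 0
    for item in package_items:
        total += item.weight * item.rate
    return total


cart = [SimpleNamespace(price=10, quantity=2),
        SimpleNamespace(price=5, quantity=1)]
# Inlining should not change behavior; only the structure differs.
assert cart_total(cart) == sum_of_products(cart, "price", "quantity")
```

Once both callers stand on their own, the helper can be deleted, and each function is free to grow its own rules before you decide whether a new abstraction is worth it.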

When the abstraction is wrong, the fastest way forward is back.

—Sandi Metz, The Wrong Abstraction

tl;dr

It’s fine to look for the right abstraction, but don’t obsess over it. Don’t be afraid to copy code when it helps you keep momentum and find the right abstraction.

It bears repeating: “Repeat yourself.”

  1. For some examples, see Ferris working on Rustendo64 or tokiospliff working on a C++ game engine.

  2. This is also how I write prose: I first write a draft and block my inner critic, and then I play the role of the editor/critic and “refactor” the text. This way, I get the best of both worlds: a quick feedback loop which doesn’t block my creativity, and a final product which is more polished and well-structured. Of course, I did not invent this approach. I recommend reading “Shitty first drafts” from Anne Lamott’s book Bird by Bird: Instructions on Writing and Life if you want to learn more about this technique.

  3. This is similar to the OODA loop concept, which stands for “Observe, Orient, Decide, Act.” It was developed by military strategist John Boyd. Fighter pilots use it to wait until the last responsible moment to decide on a course of action, which allows them to make the best decision based on the current situation and available information.
